R. Nathan


Image Processing via VLSI 

Abstract 

The general purpose digital computer is not able to handle the data 
rates and subsequent throughput requirements of data systems in the mid- 
80's and early 90's. In particular vast quantities of image data will 
have to be calibrated, geometrically projected, mosaicked and otherwise 
manipulated and merged at rates that far exceed the capacities of present 
systems. Even the "super" computers, some of which have been designed 
explicitly for image processing, promise insufficient throughput capacity. 
Implementing specific image processing algorithms via Very Large Scale 
Integrated systems offers a potent solution to this perplexing problem. 

Two algorithms stand out as being particularly critical — geometric map 
transformation and filtering or correlation. These two functions form the 
basis for data calibration, registration and mosaicking. VLSI presents
itself as an inexpensive ancillary function that can be added to almost
any general purpose computer. If the geometry and filter algorithms are
implemented in VLSI, the processing rate bottleneck would be significantly
relieved. This work identifies the image processing functions that limit
the ability of present systems to meet future throughput needs, translates
these functions into algorithms, implements them in VLSI technology, and
interfaces the hardware to a general purpose digital computer.

Objectives 

* Design and fabricate special purpose VLSI chips to perform specific 
image processing algorithms. 

* Integrate such chips into interface systems which are under the
control of a central general purpose processor assigned to image
processing.

* In particular, design and test a filter system and a cubic spline
geometric reprojection system.

* Examine and develop VLSI design concepts for other image processing 
requirements. 

Motivation 

Bracken (1) has spelled out the need for improving the processing 
speed of image data collected from an ever increasing array of satellites 
each with a larger information bandwidth than its predecessor. The proces- 
sing problem has several dimensions. 

* In order to enable the end user to use new information, the data
must be restructured to match a previous information base. A common
requirement is to reproject and register images taken from an
oblique satellite view to a normal projection on the surface. This
reprojection, along with the need to correct for any systematic
camera distortions, requires that images be stretched like a "rubber
sheet" to fit the desired reprojection. This shift, which entails
many rotations and magnifications within each image, requires
relocation of interpolated data to locations which may be far distant
from the original position.

* In order to determine the precise shift which will bring two images
into registration, match points must be determined. Modified cross
correlation calculations can be used to find the best fit of
these match points. Correlation and filtering have similar
mathematical structure and both can be implemented with a special purpose
VLSI system. The filtering operation is also used to smooth noisy
data or enhance fine image detail. Image enhancement has been
applied rather infrequently, in spite of the image improvement it
offers, because it is an expensive process. VLSI operation can reduce
the cost and time of processing. Filtering also enables certain feature
detection and extraction algorithms.

* Another dimension of image processing relates to where in the data
stream the processing is to be performed. Our present technology
thus far requires transmission of unprocessed images. As high speed
compact processing technology evolves, processing can be moved on
to the satellite and transmission bandwidth reduced by several
orders of magnitude.

* Pipeline processing implies placing simultaneous hardwired algorithms 
in tandem. Other image processing functions such as sorting maximum 
values, change detection, developing time dimensions on accumulated 
data bases become accessible in near real time when thinking in 
terms of modular hardware. 

Background 

Digital image processing became a working reality in the early 1960's 
with the advent of JPL's Ranger, Surveyor and Mariner series. We (Nathan-2) 
had effectively established the requirements for various processing
algorithms from pragmatic pressures. Filtering was performed to remove
systematic noise from the camera, and geometric corrections were also
required to correct camera distortion. Filtering further evolved to
enhance fine image detail without stretching low frequency data to the
point of image saturation. In those early missions it was generally
possible to keep up with the data load with the processing power of
computers available at the time. No real attempt was made to do more
than refine those algorithms using commercially available machines.
Since that time the situation has dramatically changed in terms of
volume while the algorithms have remained relatively static.

Several attempts at creating special parallel processors have proven
expensive and unwieldy. A comparison of several "super" computers was
performed by Mitre Corporation (3). The machines were given several
classes of very limited tests against which to measure processing
effectiveness.

Some of the computers compared were the Cray I, the DAP (English), the
PEPE (Army), the Illiac IV, the Cyber 203, the AP-120B, the CLIP 3 and the
MPP (Goddard-Goodyear). All but the AP-120B are extremely expensive
(several millions of dollars each), whereas the AP-120B is very much slower.
Mitre judged the MPP to be the best under the given conditions.
But the filter and geometric test problem was much too constrained and
just fit the 128x128 area of parallel memory in the MPP. Only a kernel of
20x20 can be filtered against a 128x128 image. Only a shift of 8 pixels
using linear interpolation is allowed for geometric remapping. These
restrictions have been hardwired and only slow software can overcome them.
The heart of the MPP is a general purpose VLSI processing unit. The
concept is still directed toward the performance of multiple functions
by a particular hardware unit.

VLSI is a general tool which can be viewed as an extension of soft- 
ware in the sense of the next generation of computing power. These con- 
cepts have been under development at Caltech under Mead (4). JPL has a 
very close relationship with the campus. We have recently been working 
with Mead to aid in the rapid evolution of the software techniques for 
designing VLSI circuitry and have, in addition, been developing filtering 
hardware concepts following a data flow algorithm from Cohen (5) which 
allows successive multiplier-accumulators to be pipelined. A modification 
in memory handling allows an extension to two dimensions and is being 
breadboarded to fit the VLSI design. As a seed effort we have started to
design a VLSI chip which will allow us to create a 31 x 31 element kernel
that will compete favorably in time and cost against the MPP.
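
The pipelining of successive multiplier-accumulators can be illustrated
in software. The following is a minimal sketch, assuming a chain in which
each stage holds one weight and one partial-sum register and passes its
partial sum to its neighbor on every clock tick. It is not taken from
Cohen's report; the function names `pipelined_fir` and `direct_fir` are
hypothetical and the one dimensional case is shown only for brevity.

```python
def pipelined_fir(samples, weights):
    """Simulate a chain of multiplier-accumulator stages, one per weight.

    On each clock tick every stage multiplies the incoming sample by its
    own weight and adds the partial sum handed over by the next stage.
    """
    n = len(weights)
    partial = [0] * n              # partial-sum register of each stage
    out = []
    for x in samples:              # one loop iteration = one clock tick
        new = [0] * n
        for k in range(n):
            carry_in = partial[k + 1] if k + 1 < n else 0
            new[k] = weights[k] * x + carry_in
        partial = new
        out.append(partial[0])     # stage 0 delivers the finished sum
    return out


def direct_fir(samples, weights):
    """Reference result: y[i] = sum over k of weights[k] * samples[i - k]."""
    out = []
    for i in range(len(samples)):
        acc = 0
        for k, w in enumerate(weights):
            if i - k >= 0:
                acc += w * samples[i - k]
        out.append(acc)
    return out


if __name__ == "__main__":
    data = [1, 2, 3, 4]
    taps = [1, 10, 100]
    print(pipelined_fir(data, taps))
    print(direct_fir(data, taps))
```

Both functions return the same sequence; the pipelined form simply shows
that all multiplies in a tick are independent and could run at once in
hardware.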

Approach 

VLSI design is still a rapidly evolving field. Computer languages
are under development which will eventually allow high level statements
to be made which establish functional criteria, and these statements would
be converted to n-channel metal oxide semiconductor (n-MOS) or
complementary MOS (c-MOS) wire lists. These lists are in turn converted
by computer to drawings of different overlays of metal and metal oxides.
The drawings are then photo-reduced and photo-etched onto silicon or
sapphire wafers which are then cut and wired to form individual chips.

The amount of logic that can be placed on a single chip is also
evolving rapidly. Today many tens of thousands of transistors can be
placed on a surface of 7 x 7 mm. Within three to five years that number is
expected to increase by 2 to 3 orders of magnitude. At JPL we are
experimenting with ways to develop languages which will allow variation of
parameters such as the number of multipliers, the number of bits per
multiplier, serial or parallel addition and multiplication, and other
parameters which will allow us to tailor the design to customer need
without massive redesign effort.

As we contemplate the marriage of VLSI technology with image proces- 
sing requirements, not all the pieces are yet in place. Some of the 
designing effort is still initiated by manual drawings to meet VLSI design 
rule requirements. The logic for multiplication is still not finalized 
as competition for area (on the chip) and speed (minimum clock cycles 
per multiply) is under study. Conceptual design for the geometry opera- 
tion is under rapid restructuring as various experts are consulted 
(Billingsley-6). Projects like these are studied by Caltech students in
Prof. Mead's classes and valuable interchange is derived from those
discussions. The whole idea is to be able to upgrade design concepts and
create new VLSI chips as though debugging computer software.

In parallel with the actual chip development, hardware is being
developed to interface the VLSI to existing computer structures. A not
too surprising result emerges as this effort progresses. VLSI allows an
improvement in throughput over a serial general purpose computer by a
factor of 2 to 3 orders of magnitude. We very quickly become I/O bound
in terms of magnetic tape or disk. Consideration must be given to
grabbing the data once from mass (serial) storage and performing all
processes at once (pipeline serial) before sending the restructured data
or extracted information back on to tape or out to the customer.

We have spent some time with the initial development of a VLSI chip
which presently has four multipliers, each of which stores 20 bits and
multiplies an 8-bit pixel by a 12-bit weight. The chip has been submitted
for fabrication external to JPL. Turnaround is about two months. JPL's
role is not to compete with the commercial fabrication process; we
are more interested in developing more versatile VLSI design tools.
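
The 20-bit storage figure is consistent with the operand widths: the
largest product of an 8-bit value and a 12-bit value needs exactly 20
bits. The short check below is a sketch that assumes unsigned operands
(the text does not say whether the weights are signed); the names
`product_bits` and `multiply` are hypothetical.

```python
PIXEL_BITS = 8     # pixel width stated in the text
WEIGHT_BITS = 12   # weight width stated in the text


def product_bits(pixel_bits, weight_bits):
    """Bits needed to hold the largest unsigned product of two operands."""
    largest = ((1 << pixel_bits) - 1) * ((1 << weight_bits) - 1)
    return largest.bit_length()


def multiply(pixel, weight):
    """Multiply one pixel by one weight after checking operand ranges."""
    if not 0 <= pixel < (1 << PIXEL_BITS):
        raise ValueError("pixel does not fit in 8 bits")
    if not 0 <= weight < (1 << WEIGHT_BITS):
        raise ValueError("weight does not fit in 12 bits")
    return pixel * weight


if __name__ == "__main__":
    print(product_bits(PIXEL_BITS, WEIGHT_BITS))   # width of one product
    print(multiply(255, 4095))                     # largest possible product
```

Summing many such products, as a filter does, would need additional
accumulator bits beyond the 20 required for a single product.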

Some effort has gone into the concept of a pipelined geometry
remapping chip. An initial concept designed by us (Nathan) was tried
successfully by Northrop for the Air Force. But that was only a nearest
neighbor design. We have developed many software algorithms over the
years, and recently thought has been given to a four point modified cubic
spline which should not degrade the image as does nearest neighbor or
linear interpolation. The concept is to perform two orthogonal stretches
(or contractions), one along each axis, as serial operations (pipelined
in two VLSI functions) while having direct access to several megabytes of
random access memory from the computer main frame. The proposed speed of
transformation is many times that presently available.
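
The two-pass idea can be sketched in software as a one dimensional
four-point resampler applied first along rows and then along columns.
This is an illustration only: the Catmull-Rom weights used here are an
assumed stand-in for the "modified cubic spline" (whose exact form is not
given), a uniform scale factor replaces the general rubber-sheet mapping,
and all function names are hypothetical.

```python
import math


def cubic_weights(f):
    """Four Catmull-Rom weights for fractional distance f in [0, 1)."""
    return (
        (-f**3 + 2 * f**2 - f) / 2,
        (3 * f**3 - 5 * f**2 + 2) / 2,
        (-3 * f**3 + 4 * f**2 + f) / 2,
        (f**3 - f**2) / 2,
    )


def resample_line(line, scale):
    """Stretch (scale > 1) or contract (scale < 1) one line of samples."""
    n_out = max(1, round(len(line) * scale))
    last = len(line) - 1
    out = []
    for j in range(n_out):
        pos = j / scale                  # source coordinate of output sample j
        i = math.floor(pos)
        w = cubic_weights(pos - i)
        acc = 0.0
        for k in range(4):
            idx = min(max(i - 1 + k, 0), last)   # clamp at the line ends
            acc += w[k] * line[idx]
        out.append(acc)
    return out


def resample_image(image, scale_x, scale_y):
    """Two serial one dimensional passes: along rows, then along columns."""
    rows = [resample_line(r, scale_x) for r in image]
    cols = [resample_line(list(c), scale_y) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


if __name__ == "__main__":
    img = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
    for row in resample_image(img, 2, 2):
        print([round(v, 2) for v in row])
```

Because each pass touches only one axis, the two passes could be two
hardware stages in tandem, with the intermediate image held in memory
between them.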

Expected Results 

Two sets of hardwired algorithms are to be produced. One algorithm
will perform two dimensional filtering or correlation on an arbitrarily
large image using a 31x31 kernel (in the present design; a modifiable
parameter). The other algorithm is a pair of one dimensional cubic spline
geometry remapping functions which under software control will "rubber
sheet" one image to another according to pre-established pass points. It
is expected that these systems will be installed for use in JPL's Image
Processing Laboratory (IPL) and be used in its image processing
production mode.

Progress is anticipated in the development of software which utilizes 
the filter hardware to establish "pass points" and these in turn will 
generate the correction grid for the geometry hardware. 
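
As an illustration of how a correlation step could establish a pass
point, the sketch below slides a small template over an image and keeps
the offset with the highest mean-removed, normalized correlation. This is
an assumed textbook form, not necessarily the "modified cross
correlation" referred to earlier, and the names `correlate_at` and
`find_pass_point` are hypothetical.

```python
def correlate_at(image, template, top, left):
    """Mean-removed, normalized correlation of template with one window."""
    th, tw = len(template), len(template[0])
    win = [image[top + r][left + c] for r in range(th) for c in range(tw)]
    tpl = [v for row in template for v in row]
    mw = sum(win) / len(win)
    mt = sum(tpl) / len(tpl)
    num = sum((a - mw) * (b - mt) for a, b in zip(win, tpl))
    den = (sum((a - mw) ** 2 for a in win)
           * sum((b - mt) ** 2 for b in tpl)) ** 0.5
    return num / den if den else 0.0     # flat windows score zero


def find_pass_point(image, template):
    """Return (row, col) of the window that best matches the template."""
    th, tw = len(template), len(template[0])
    best, best_pos = -2.0, (0, 0)        # scores lie in [-1, 1]
    for top in range(len(image) - th + 1):
        for left in range(len(image[0]) - tw + 1):
            score = correlate_at(image, template, top, left)
            if score > best:
                best, best_pos = score, (top, left)
    return best_pos


if __name__ == "__main__":
    template = [[1, 2], [3, 9]]
    image = [[0] * 6 for _ in range(6)]
    for r in range(2):
        for c in range(2):
            image[3 + r][2 + c] = template[r][c]
    print(find_pass_point(image, template))
```

The offsets found for several templates would form the set of pass
points from which a correction grid is built.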

Also investigation into class extraction using the filter hardware 
will be started. Studies of this sort exist as software only. It is 
desired to explore increased dimension of class search once a fast hard- 
ware filter becomes available. 

Another product which can be expected is the ability to reproduce 
other VLSI configurations with minor changes in design parameters. This 
ability gives us the power to update new hardware without major mechani- 
cal redesign as customer needs change. 

As concepts develop regarding the utility of other imaging operations, 
these too shall be pursued. 





References 


1. Bracken, P.A. (1980) - "Earth Resource Observations Data Systems
in the 1980's" in 1980 Annual Meeting of American Astronautical
Society and AIAA Paper 80-240.

2. Nathan, R. (1966) - "Digital Video Data-Handling," Technical Report

3. MITRE Corp. (1979) - "Parallel Processor Technology Trade-off Study
for the NASA End-to-End Data System (NEEDS)"

4. Mead, C.A. and Conway, L.A. (1980) - "Introduction to VLSI Systems."
Addison-Wesley pub.

5. Cohen, D. (1978) - "Mathematical Approaches to Computational Networks,"
ISI/RR-78-73 Information Sciences Institute/USC.

6. Billingsley, F.C. (1981) - "Modelling Misregistration and Related
Effects on Multispectral Classification," JPL report in press.


[Figure: "Rapid Data Delivery" plotted against year, 1960 to 2000, with
delivery-time annotations for Landsat 1-2, Landsat 3 and Landsat D
(legible values include 8 months, 12 weeks and 2 days). The vertical axis
label and scale are not recoverable.]

[Figure: diagram not recoverable; surviving labels include COLOR CRT
(twice) and B/W CRT.]




























[Figure: ranked listing of commercial computers against a numeric
performance measure, including NCR, Univac, Burroughs, Honeywell, DEC,
IBM, Amdahl, NAS and CDC machines and ending with the Cray I. Individual
entries and values are not recoverable.]






VLSI 

VERY LARGE SCALE INTEGRATED SYSTEMS

DIGITAL MULTIPLIER

• TYPICAL IC MULTIPLIERS CONTAIN REGISTERS FOR MULTIPLIER AND
MULTIPLICAND OPERANDS.

• AN ADDITIONAL REGISTER IS PROVIDED FOR THE PRODUCT.

• MODERN LSI MULTIPLIERS PERFORM ADDITIONS IN A RIPPLE FASHION.
THAT IS, EACH SUM IS PASSED ON FROM ONE ADDER TO THE NEXT WITHOUT
THE USE OF CLOCKED SEQUENTIAL CIRCUITS. THEREFORE, THE
MULTIPLIER IS COMPRISED OF AN ARRAY OF GATES AND ADDERS.
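
The array-of-gates-and-adders structure can be modeled bit by bit. The
sketch below is an assumption-laden illustration, not the chip's actual
logic: it forms each partial product with AND gates and sums the rows
with ripple-carry adders. The default operand widths of 8 and 12 bits are
borrowed from the chip described in the paper, and all function names are
hypothetical.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder built from XOR, AND and OR gates."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return s, carry_out


def ripple_add(x_bits, y_bits):
    """Add two equal-length little-endian bit lists; the carry ripples up."""
    out, carry = [], 0
    for a, b in zip(x_bits, y_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    out.append(carry)
    return out


def array_multiply(x, y, x_bits=8, y_bits=12):
    """Multiply unsigned integers with AND gates and rows of ripple adders."""
    width = x_bits + y_bits
    acc = [0] * width
    for j in range(y_bits):
        yj = (y >> j) & 1
        # Partial product row: each multiplicand bit ANDed with multiplier
        # bit j, placed j positions to the left.
        row = [0] * width
        for i in range(x_bits):
            row[i + j] = ((x >> i) & 1) & yj
        acc = ripple_add(acc, row)[:width]   # product always fits in width
    return sum(bit << k for k, bit in enumerate(acc))


if __name__ == "__main__":
    print(array_multiply(13, 11))
    print(array_multiply(255, 4095))
```

In hardware the rows are all present at once and no clock is involved;
the software loop only traces the order in which sums pass from adder to
adder.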
GEOMETRIC REPROJECTION
RESAMPLING

N = W_1 P_1 + W_2 P_2 + W_3 P_3 + W_4 P_4

f = FRACTIONAL DISTANCE BETWEEN POINTS

FOR LINEAR INTERPOLATION OF NEW SAMPLE VALUES

W_1 = W_4 = 0
W_2 = 1 - f
W_3 = f

THEREFORE

N = P_2 (1 - f) + P_3 f

FOR CUBIC INTERPOLATION ALL FOUR W_i ARE A TABLE LOOK UP
FUNCTION OF f.

THE NEW INTERSAMPLE DISTANCE (d) CAN ALSO BE NONLINEAR.
ALLOWANCE IS MADE FOR CUBIC SPLINE ADJUSTMENT FOR NON-LINEAR SAMPLING.
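
The weighted sum and the table look-up of weights can be sketched as
follows. The linear weights follow the slide; the cubic weights are the
Catmull-Rom form, used here only as an assumed example of a four-point
cubic, and the 16-step table size and all function names are
hypothetical.

```python
def linear_weights(f):
    """W_1..W_4 for linear interpolation: only the inner points count."""
    return (0.0, 1.0 - f, f, 0.0)


def cubic_weights(f):
    """W_1..W_4 for an assumed Catmull-Rom cubic."""
    return (
        (-f**3 + 2 * f**2 - f) / 2,
        (3 * f**3 - 5 * f**2 + 2) / 2,
        (-3 * f**3 + 4 * f**2 + f) / 2,
        (f**3 - f**2) / 2,
    )


def build_table(weight_fn, steps=16):
    """Precompute the four weights for each quantized value of f."""
    return [weight_fn(k / steps) for k in range(steps)]


def resample(points, f, table):
    """N = W_1*P_1 + W_2*P_2 + W_3*P_3 + W_4*P_4, weights looked up by f."""
    steps = len(table)
    w = table[min(int(f * steps), steps - 1)]
    return sum(wi * pi for wi, pi in zip(w, points))


if __name__ == "__main__":
    p = (10, 20, 40, 80)
    print(resample(p, 0.5, build_table(linear_weights)))
    print(resample(p, 0.5, build_table(cubic_weights)))
```

A finer table trades memory for a closer approximation of the exact
weights; in either case the arithmetic per output sample is four
multiplies and three adds.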
FUTURE

INDUSTRY DOES NOT WISH TO DEVELOP CUSTOM CHIPS
INTERESTED IN MASS MARKET
JPL AND OTHER USER INSTITUTIONS NEEDED TO
DEVELOP HIGH LEVEL SOFTWARE TO CONVERT
ALGORITHMS TO PARALLEL HARDWARE
POTENTIAL APPLICATION AREAS
PATTERN EXTRACTION
SAR
I/O PARALLEL DATA FLOW
OTHER DATA BOTTLENECKS
FUNCTIONAL ELEMENTS OF CAMPUS DESIGN SYSTEM

Fabrication Broker

Notes:

1. At JPL, system will run using MAINSAIL compiler on VAX 11/780 with
VMS Operating System.

2. Switch-level simulator is based on the work of Bryant and Terman.